174 research outputs found

    eMatchSite: Sequence Order-Independent Structure Alignments of Ligand Binding Pockets in Protein Models

    © 2014 Michal Brylinski. Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques; however, to achieve high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome-scale applications, where only structure models of varying quality are available for the majority of gene products. To improve the state of the art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4–9% lower than that of pairs aligned using crystal structures. This represents a significant improvement over other algorithms; e.g., the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility to investigate drug-protein interaction networks for complete proteomes with prospective systems-level applications in polypharmacology and rational drug repositioning.
eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite
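
To illustrate what "sequence order-independent" means in the abstract above, here is a minimal Python sketch that pairs residues of two binding sites purely by spatial proximity, ignoring their order in the sequence. It is not the eMatchSite algorithm: the function name `best_assignment`, the brute-force search, the toy coordinates, and the assumption that both sites are already superposed in one coordinate frame are all hypothetical choices made for illustration.

```python
from itertools import permutations
import math

def best_assignment(site_a, site_b):
    """Pair each point in site_a with a distinct point in site_b so that
    the summed Euclidean distance is minimal.

    Assumptions (illustrative only): both sites are already superposed,
    len(site_a) <= len(site_b), and the sites are small enough for an
    exhaustive search over all pairings.
    """
    best_cost = math.inf
    best_pairs = None
    for perm in permutations(range(len(site_b)), len(site_a)):
        cost = sum(math.dist(site_a[i], site_b[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = list(enumerate(perm))
    return best_pairs, best_cost

# Toy coordinates: site_b lists nearly the same positions in a different order.
site_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
site_b = [(0.1, 5.0, 0.0), (0.0, 0.2, 0.0), (5.1, 0.0, 0.0)]
pairs, cost = best_assignment(site_a, site_b)
print(pairs, round(cost, 3))
```

The pairing that comes back does not follow the index order of either list, which is the point of an order-independent alignment. An exhaustive search scales factorially, so a practical tool would need a polynomial-time assignment solver instead.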

    eVolver: An optimization engine for evolving protein sequences to stabilize the respective structures

    Background: Many structural bioinformatics approaches employ sequence profile-based threading techniques. To improve fold recognition rates, homology searching may include artificially evolved amino acid sequences, which were demonstrated to enhance the sensitivity of protein threading in targeting midnight zone templates. Findings: We describe implementation details of eVolver, an optimization algorithm that evolves protein sequences to stabilize the respective structures by a variety of potentials, which are compatible with those commonly used in protein threading. In a case study focusing on the LARG PDZ domain, we show that artificially evolved sequences have a high capability to recognize the correct protein structures using standard sequence profile-based fold recognition. Conclusions: Computationally designed protein sequences can be incorporated in existing sequence profile-based threading approaches to increase their sensitivity. They also provide a desired linkage between protein structure and function in in silico experiments that relate to, e.g., the completeness of protein structure space, the origin of folds and the protein universe. eVolver is freely available as a user-friendly webserver and a well-documented stand-alone software distribution at http://www.brylinski.org/evolver. © 2013 Brylinski; licensee BioMed Central Ltd.

    The aqueous environment as an active participant in the protein folding process

    Existing computational models applied in the protein structure prediction process do not sufficiently account for the presence of the aqueous solvent. The solvent is usually represented by a predetermined number of H2O molecules in the bounding box which contains the target chain. The fuzzy oil drop (FOD) model, presented in this paper, follows an alternative approach, with the solvent assuming the form of a continuous external hydrophobic force field, with a Gaussian distribution. The effect of this force field is to guide hydrophobic residues towards the center of the protein body, while promoting exposure of hydrophilic residues on its surface. This work focuses on the following sample proteins: Engrailed homeodomain (RCSB: 1enh), Chicken villin subdomain hp-35, n68h (RCSB: 1yrf), Chicken villin subdomain hp-35, k65(nle), n68h, k70(nle) (RCSB: 2f4k), Thermostable subdomain from chicken villin headpiece (RCSB: 1vii), de novo designed single chain three-helix bundle (a3d) (RCSB: 2a3d), albumin-binding domain (RCSB: 1prb) and lambda repressor-operator complex (RCSB: 1lmb). © 2018 The Authors. Published by Elsevier Inc.
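
The idea of a Gaussian-shaped hydrophobic field centered on the protein body can be sketched in a few lines of Python. This is only an illustration of the general shape described in the abstract above, not the published FOD implementation: the function name `gaussian_hydrophobicity`, the normalization over the sampled points, and the width values are assumptions.

```python
import math

def gaussian_hydrophobicity(points, sigmas):
    """Evaluate a 3D Gaussian centered at the origin for each point and
    normalize the values so that they sum to 1.

    points: (x, y, z) coordinates relative to the protein center.
    sigmas: (sx, sy, sz) widths along each axis (assumed values here).
    """
    sx, sy, sz = sigmas
    raw = [
        math.exp(-(x * x) / (2 * sx * sx)
                 - (y * y) / (2 * sy * sy)
                 - (z * z) / (2 * sz * sz))
        for x, y, z in points
    ]
    total = sum(raw)
    return [value / total for value in raw]

# Three hypothetical residue positions at increasing distance from the center.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
profile = gaussian_hydrophobicity(points, sigmas=(4.0, 4.0, 4.0))
print(profile)
```

The values decrease with distance from the center, so a residue near the core is expected to be more hydrophobic than one near the surface, matching the qualitative behavior the abstract describes.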

    Unleashing the power of meta-threading for evolution/structure-based function inference of proteins

    Protein threading is widely used in the prediction of protein structure and the subsequent functional annotation. Most threading approaches employ similar criteria for the template identification for use in both protein structure and function modeling. Using structure similarity alone might result in a high false positive rate in protein function inference, which suggests that selecting functional templates should be subject to a different set of constraints. In this study, we extend the functionality of eThread, a recently developed approach to meta-threading, focusing on the optimal selection of functional templates. We optimized the selection of template proteins to cover a broad spectrum of protein molecular function: ligand, metal, inorganic cluster, protein, and nucleic acid binding. In large-scale benchmarks, we demonstrate that the recognition rates in identifying templates that bind molecular partners in similar locations are very high, typically 70-80%, at the expense of a relatively low false positive rate. eThread also provides useful insights into the chemical properties of binding molecules and the structural features of binding. For instance, the sensitivity in recognizing similar protein-binding interfaces is 58% at only 18% false positive rate. Furthermore, in comparative analysis, we demonstrate that meta-threading supported by machine learning outperforms single-threading approaches in functional template selection. We show that meta-threading effectively detects many facets of protein molecular function, even in a low-sequence identity regime. The enhanced version of eThread is freely available as a webserver and stand-alone software at http://www.brylinski.org/ethread. © 2013 Brylinski
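
For readers less familiar with the metrics quoted in the abstract above, the following Python sketch shows how sensitivity and false positive rate are computed from confusion-matrix counts. The counts are hypothetical and were chosen only so that the resulting rates equal the 58% and 18% figures quoted in the abstract; they are not data from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that are recognized (true positive rate)."""
    return true_pos / (true_pos + false_neg)

def false_positive_rate(false_pos, true_neg):
    """Fraction of actual negatives that are wrongly reported as positives."""
    return false_pos / (false_pos + true_neg)

# Hypothetical counts for 100 positive and 100 negative templates.
tpr = sensitivity(true_pos=58, false_neg=42)
fpr = false_positive_rate(false_pos=18, true_neg=82)
print(tpr, fpr)
```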

    Exploring the dark matter of a mammalian proteome by protein structure and function modeling

    Background: A growing body of evidence shows that gene products encoded by short open reading frames play key roles in numerous cellular processes. Yet, they are generally overlooked in genome assembly, escaping annotation because small protein-coding genes are difficult to predict computationally. Consequently, there are still a considerable number of small proteins whose functions are yet to be characterized. Results: To address this issue, we apply a collection of structural bioinformatics algorithms to infer molecular function of putative small proteins from the mouse proteome. Specifically, we construct 1,743 confident structure models of small proteins, which reveal a significant structural diversity with a noticeably high helical content. A subsequent structure-based function annotation of small protein models exposes 178,745 putative protein-protein interactions with the remaining gene products in the mouse proteome, 1,100 potential binding sites for small organic molecules and 987 metal-binding signatures. Conclusions: These results strongly indicate that many small proteins adopt three-dimensional structures and are fully functional, playing important roles in transcriptional regulation, cell signaling and metabolism. Data collected through this work is freely available to the academic community at http://www.brylinski.org/content/databases to support future studies aimed at elucidating the functions of hypothetical small proteins. © 2013 Brylinski; licensee BioMed Central Ltd.

    Local alignment of ligand binding sites in proteins for polypharmacology and drug repositioning

    © Springer Science+Business Media LLC 2017. The administration of drugs is a key strategy in pharmacotherapy to treat diseases. Drugs are typically developed to modulate the function of specific proteins, which are directly associated with particular disease states. Nonetheless, recent studies suggest that protein-drug interactions are rather promiscuous and the majority of pharmaceuticals exhibit activity against multiple, often unrelated proteins. Certainly, the lack of selectivity often leads to drug side effects; on the other hand, these polypharmacological attributes can be used to develop drugs acting on multiple targets within a unique disease pathway, as well as to identify new targets for existing drugs, which is known as drug repositioning. To support drug development and repurposing, we developed eMatchSite, a new approach to detect those binding sites having the capability to bind similar compounds. eMatchSite is available as a standalone software and a webserver at http://www.brylinski.org/ematchsite

    Elucidating the druggability of the human proteome with eFindSite

    © 2019, Springer Nature Switzerland AG. Identifying the viability of protein targets is one of the preliminary steps of drug discovery. Determining the ability of a protein to bind drugs in order to modulate its function, termed druggability, requires a non-trivial amount of time and resources. Inability to properly measure druggability has accounted for a significant portion of failures in drug discovery. This problem is further exacerbated by the large sample space of proteins involved in human diseases. Because of these barriers, the druggability space within the human proteome remains unexplored, which has made it difficult to develop drugs for numerous diseases. Hence, we present a new feature developed in eFindSite that employs supervised machine learning to predict the druggability of a given protein. Benchmarking calculations against the Non-Redundant data set of Druggable and Less Druggable binding sites demonstrate that the AUC for druggability prediction with eFindSite is as high as 0.88. With eFindSite, we determined that the human druggability space comprises 10,191 proteins. Considering the disease space from the Open Targets Platform and excluding already known targets from the predicted data set reveal 2,731 potentially novel therapeutic targets. eFindSite is freely available as a stand-alone software at https://github.com/michal-brylinski/efindsite
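
The AUC quoted in the abstract above summarizes how well predicted scores separate positive from negative examples. As a minimal sketch, the Python function below computes the area under the ROC curve as the fraction of positive-negative pairs in which the positive example receives the higher score, counting ties as one half. The function name `roc_auc` and the toy scores and labels are hypothetical; this is not eFindSite code or benchmark data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores, higher meaning more likely positive.
    labels: 1 for positive (e.g. druggable), 0 for negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Toy example: three of the four positive-negative pairs are ranked correctly.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the 0.88 reported above.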